Psweep: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources
Authors
Abstract
Data warehouses (DWs) are built by gathering information from several information sources (ISs) and integrating it into one repository customized to users' needs. Recent work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. SWEEP, proposed by Agrawal et al. [AAS97], is one of the more popular solutions, even though its performance is limited because the view maintenance module enforces a sequential ordering on the handling of data updates from ISs. We have overcome this limitation by developing a parallel algorithm for view maintenance, called PSWEEP, that retains all benefits of SWEEP while offering substantially improved performance. To perform parallel view maintenance, we solve two issues: detecting maintenance-concurrent data updates in a parallel mode, and correcting the problem that the DW commit order may not correspond to the DW update processing order due to parallel maintenance handling. By decomposing SWEEP into an architecture of modular components, we can insert a local timestamp assignment module for detecting maintenance-concurrent data updates without requiring any global clock synchronization. We introduce the negative counter concept as a simple yet sufficient solution to the Variant-DW-Commit problem of variant orders of committing effects of data updates to the DW. We have proven the correctness of PSWEEP, guaranteeing that our strategy generates the correct final DW state. An evaluation of both SWEEP and PSWEEP shows that PSWEEP has the potential for multi-fold performance improvement over SWEEP, depending on the number of threads supportable in the given DW system implementation.
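The abstract does not spell out how the negative counter works, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm. It assumes the DW view stores a signed count per tuple, that each maintenance thread produces a signed delta, and that a count may temporarily fall below zero when a deletion commits before the matching insertion. The names `commit`, `visible_rows`, and `final_state` are hypothetical.

```python
from itertools import permutations


def commit(view, delta):
    """Apply one signed delta to the view; counts may go negative (assumption)."""
    for row, change in delta.items():
        view[row] = view.get(row, 0) + change
        if view[row] == 0:
            del view[row]  # drop rows whose net count is zero
    return view


def visible_rows(view):
    """Rows a reader would see: only those with a positive count."""
    return {row for row, count in view.items() if count > 0}


# Three hypothetical deltas, one per source update, listed in processing order.
deltas = [
    {("alice", "db"): +1},
    {("alice", "db"): -1, ("bob", "os"): +1},
    {("bob", "os"): +1},
]


def final_state(order):
    """Commit the deltas in the given order and return the resulting view."""
    view = {}
    for delta in order:
        commit(view, delta)
    return view


# Every commit order ends in the same view, because signed addition commutes.
states = [final_state(order) for order in permutations(deltas)]
```

Under these assumptions, committing the second delta first leaves `("alice", "db")` with a count of -1 until the first delta arrives, after which the row cancels out. The final view is the same for all six commit orders.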
Similar resources
PVM: Parallel View Maintenance under Concurrent Data Updates of Distributed Sources
Data warehouses (DW) are built by gathering information from distributed information sources (ISs) and integrating it into one customized repository. In recent years, work has begun to address the problem of view maintenance of DWs under concurrent data updates of different ISs. Popular solutions such as ECA and Strobe achieve such concurrent maintenance, however with the requirement of quiescenc...
Detection and Correction of Conflicting Source Updates for Materialized View Maintenance
Materialized views, often derived from several data sources, must be maintained under source changes. In a distributed context, autonomous source updates can be concurrent and thus cause erroneous maintenance results. State-of-the-art maintenance strategies issue maintenance queries to the sources and apply compensating queries to correct such errors. However, these solutions are limited to han...
Data Warehouse Maintenance under Concurrent Schema and Data Updates
Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...
WPI-CS-TR-98-8, August 1998: Data Warehouse Maintenance Under Concurrent Schema
Data warehouses (DW) are built by gathering information from several information sources and integrating it into one repository customized to users' needs. Recently proposed view maintenance algorithms tackle the problem of (concurrent) data updates happening at different autonomous ISs, whereas the EVE system addresses the maintenance of a data warehouse after schema changes of ISs. The concurr...
An Architecture of a Data
We present incremental view maintenance algorithms for a data warehouse derived from multiple distributed autonomous data sources. We begin with a detailed framework for analyzing view maintenance algorithms for multiple data sources with concurrent updates. Earlier approaches for view maintenance in the presence of concurrent updates typically require two types of messages: one to compute the ...
Publication date: 1999